Mapping of Sequence Reads to the Reference Genomes    ◾    67

The QNAME field holds the name of the read (the query sequence), which is taken from the FASTQ file. FLAG is a bitwise integer that describes properties of the alignment (e.g., paired, unmapped, or duplicate). Each property is stored as a code, as shown in Table 2.3.

The FLAG field in the SAM file may hold any one of the decimal values listed in Table 2.3 or the sum of several of those values. For instance, Figure 2.16 shows that the FLAG for the read “SRR062634.6862698” is 99, which combines four conditions: 1+2+32+64=99. These conditions describe the alignment as follows: the read is paired (1), the aligner mapped both reads of the pair properly (2), the mate (the next read of the pair) is mapped to the reverse strand (32), and this read is the first read in the pair (64). Instead of doing mental arithmetic to decode such a FLAG number, we can use the “samtools flags” command as follows (Figure 2.16):

samtools flags 99

samtools flags
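The decomposition of 99 into 1+2+32+64 can also be reproduced with a few lines of Python. The following is a minimal sketch, not part of samtools: the function name decode_flag and the list SAM_FLAGS are chosen here for illustration, while the bit values and labels are assumed to follow the SAM specification and the names printed by “samtools flags”.

```python
# Bit values follow the SAM specification; the labels mirror the names
# printed by `samtools flags`. SAM_FLAGS and decode_flag are illustrative
# names, not part of any library.
SAM_FLAGS = [
    (0x1, "PAIRED"),           # 1: read is paired
    (0x2, "PROPER_PAIR"),      # 2: both reads mapped as a proper pair
    (0x4, "UNMAP"),            # 4: read is unmapped
    (0x8, "MUNMAP"),           # 8: mate is unmapped
    (0x10, "REVERSE"),         # 16: read is on the reverse strand
    (0x20, "MREVERSE"),        # 32: mate is on the reverse strand
    (0x40, "READ1"),           # 64: first read in the pair
    (0x80, "READ2"),           # 128: second read in the pair
    (0x100, "SECONDARY"),      # 256: secondary alignment
    (0x200, "QCFAIL"),         # 512: failed quality checks
    (0x400, "DUP"),            # 1024: PCR or optical duplicate
    (0x800, "SUPPLEMENTARY"),  # 2048: supplementary alignment
]


def decode_flag(flag):
    """Return the names of all bits that are set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAGS if flag & bit]


print(decode_flag(99))
# ['PAIRED', 'PROPER_PAIR', 'MREVERSE', 'READ1']
```

Testing each bit with the bitwise AND operator is what makes the sum unambiguous: every condition occupies its own bit, so any FLAG value corresponds to exactly one set of conditions.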

The RNAME field shows the name of the reference sequence to which the read is aligned, such as a chromosome name (e.g., 1). This field is set to “*” for an unmapped read.

TABLE 2.2  Eleven Mandatory Fields of the Alignment Section of the SAM File

Column  Field  Description
1       QNAME  The query sequence name (string)
2       FLAG   Bitwise flag (integer)
3       RNAME  Reference sequence name (string)
4       POS    1-based leftmost mapping position of the first matching base (integer)
5       MAPQ   Mapping quality (integer)
6       CIGAR  CIGAR string (string)
7       RNEXT  Reference name of the mate or next read (string)
8       PNEXT  Position of the mate or next read (integer)
9       TLEN   Observed template length (integer)
10      SEQ    The read sequence (string)
11      QUAL   ASCII-encoded Phred-scaled base qualities (Phred+33) (string)

FIGURE 2.15  An alignment section of a SAM file.
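To make the column layout of Table 2.2 concrete, the sketch below splits one tab-delimited alignment line into the eleven mandatory fields and converts the QUAL string back to Phred scores. It is an illustration under stated assumptions: the names SamRecord, parse_alignment_line, and phred_scores are hypothetical, and the example record is invented for demonstration rather than taken from a real SAM file.

```python
from typing import NamedTuple


class SamRecord(NamedTuple):
    """The eleven mandatory SAM alignment fields, in column order."""
    qname: str
    flag: int
    rname: str
    pos: int
    mapq: int
    cigar: str
    rnext: str
    pnext: int
    tlen: int
    seq: str
    qual: str


def parse_alignment_line(line):
    """Split one alignment line on tabs; optional fields after column 11 are ignored."""
    f = line.rstrip("\n").split("\t")
    if len(f) < 11:
        raise ValueError("expected at least 11 tab-separated fields")
    return SamRecord(f[0], int(f[1]), f[2], int(f[3]), int(f[4]),
                     f[5], f[6], int(f[7]), int(f[8]), f[9], f[10])


def phred_scores(qual):
    """Convert a Phred+33 QUAL string to integer scores ('*' means not stored)."""
    if qual == "*":
        return []
    return [ord(ch) - 33 for ch in qual]


# Invented record for illustration only (not from a real dataset).
line = "read1\t99\t1\t1000\t60\t10M\t=\t1200\t210\tACGTACGTAC\tIIIIIIIIII"
rec = parse_alignment_line(line)

print(rec.rname, rec.pos, rec.cigar)  # 1 1000 10M
print(phred_scores(rec.qual)[:3])     # [40, 40, 40] because 'I' is ASCII 73
```

For real analyses, a dedicated library or samtools itself is usually a better choice than hand-written parsing; the sketch is only meant to show how the columns map to typed values.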